core/rawdb, triedb/pathdb: introduce trienode history #32596

Merged
rjl493456442 merged 3 commits into ethereum:master from rjl493456442:trie-archive-p3
Oct 10, 2025

Conversation

@rjl493456442
Member

This pull request builds on #32523 and implements the structure of trienode history.

@rjl493456442 rjl493456442 force-pushed the trie-archive-p3 branch 2 times, most recently from d4b1023 to ca6e68c Compare September 17, 2025 06:31
@MariusVanDerWijden MariusVanDerWijden self-assigned this Sep 17, 2025
@rjl493456442 rjl493456442 force-pushed the trie-archive-p3 branch 2 times, most recently from 4129367 to 60dbb64 Compare September 22, 2025 06:19
Comment thread core/rawdb/schema.go Outdated
Comment thread triedb/pathdb/history_trienode.go Outdated
value := h.nodes[owner][path]

// key section
n := binary.PutUvarint(buf[0:], uint64(prefixLen)) // key length shared (varint)
Member

I don't really get why this is needed

Member Author

Within each restart section, these rules apply:

  • the first entry is always encoded with its full key
  • every subsequent entry encodes its key in a "compressed" format, in which
    only the difference between the key and the preceding one is stored. Since
    the entries are trie nodes and the key is essentially the node path, storing
    only the diff compresses the entry key effectively.

Therefore, a few additional pieces of metadata are tracked in the key section:

  • shared key length
  • unshared key length
  • value length

This information lets us recover the key from the byte stream.

)
for i, path := range h.nodeList[owner] {
key := []byte(path)
if i%trienodeDataBlockRestartLen == 0 {
Member

Same here: I don't understand why we need the block restarts, or why the sharedPrefix will be 0 if we chunk.

Comment thread triedb/pathdb/history_trienode.go
Comment on lines +350 to +357
if len(keySection) < int(8*nRestarts)+4 {
	return nil, fmt.Errorf("key section too short, restarts: %d, size: %d", nRestarts, len(keySection))
}
for i := 0; i < int(nRestarts); i++ {
	o := len(keySection) - 4 - (int(nRestarts)-i)*8
	keyOffset := binary.BigEndian.Uint32(keySection[o : o+4])
	if i != 0 && keyOffset <= keyOffsets[i-1] {
		return nil, fmt.Errorf("key offset is out of order, prev: %v, cur: %v", keyOffsets[i-1], keyOffset)
Member

There are so many magic numbers here... it's hard to comprehend

Member Author

The nodes from different tries are aggregated and concatenated within the key and value sections.
The offsets of the keys and values belonging to each trie are recorded in the header section.

For each trie, a list of internal chunks, called restarts, is maintained.
At the end of the key section corresponding to a given trie, the offsets of these restarts are recorded.
This code resolves the offsets of these restarts.

Member Author

There are two main reasons for maintaining these restarts:

(1) compress the entry key
The key length of an entry (trie node) is not negligible. Storing only the difference from the preceding key (usually the parent node's) compresses the key effectively. Usually a 1-byte diff is sufficient.

(2) enhance the lookup efficiency
The first entry in a restart is always stored with its full key (no part shared with the preceding one). Therefore, a binary search can be performed at the restart boundaries.
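The lookup idea in (2) can be sketched as below. To keep it short, this is a simplification that works on already decoded, sorted keys rather than on the byte stream: the binary search only touches the first key of each restart (the ones stored in full), and a short linear scan then covers at most one restart. The function name `lookup` and the `interval` parameter are hypothetical, not taken from the PR.

```go
package main

import "sort"

// lookup returns the index of target in the sorted keys slice, or -1 if it
// is absent. Every interval-th key is treated as the first key of a restart.
func lookup(keys []string, interval int, target string) int {
	if interval <= 0 || len(keys) == 0 {
		return -1
	}
	nRestarts := (len(keys) + interval - 1) / interval

	// Binary search over restart boundaries only: find the first restart
	// whose first key is strictly greater than the target.
	r := sort.Search(nRestarts, func(i int) bool {
		return keys[i*interval] > target
	})
	if r == 0 {
		return -1 // target sorts before the very first key
	}

	// The target, if present, lives in the restart just before r. In the
	// real format this scan would decode prefix-compressed entries one by one.
	start := (r - 1) * interval
	end := start + interval
	if end > len(keys) {
		end = len(keys)
	}
	for i := start; i < end; i++ {
		if keys[i] == target {
			return i
		}
		if keys[i] > target {
			break
		}
	}
	return -1
}
```

The restart interval trades space for speed: a larger interval means more keys benefit from prefix compression, but the linear scan after the binary search gets longer.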

Comment thread triedb/pathdb/history_trienode.go
Member

@MariusVanDerWijden MariusVanDerWijden left a comment

Generally LGTM, small nits and a few questions

Comment thread triedb/pathdb/history_trienode_test.go Outdated
Comment thread triedb/pathdb/history_trienode_test.go Outdated
Comment thread triedb/pathdb/history_trienode_test.go
Comment thread triedb/pathdb/history_trienode_test.go
Comment thread triedb/pathdb/history_trienode_test.go Outdated
Comment thread triedb/pathdb/history_trienode_test.go
Member

@MariusVanDerWijden MariusVanDerWijden left a comment

SGTM

@rjl493456442 rjl493456442 added this to the 1.16.5 milestone Oct 10, 2025
@rjl493456442 rjl493456442 merged commit de24450 into ethereum:master Oct 10, 2025
7 of 9 checks passed
@ethereumorg092-arch ethereumorg092-arch mentioned this pull request Oct 10, 2025
Sahil-4555 pushed a commit to Sahil-4555/go-ethereum that referenced this pull request Oct 12, 2025
It's a pull request based on the ethereum#32523 , implementing the structure of
trienode history.
atkinsonholly pushed a commit to atkinsonholly/ephemery-geth that referenced this pull request Nov 24, 2025
It's a pull request based on the ethereum#32523 , implementing the structure of
trienode history.
prestoalvarez pushed a commit to prestoalvarez/go-ethereum that referenced this pull request Nov 27, 2025
It's a pull request based on the ethereum#32523 , implementing the structure of
trienode history.
gballet pushed a commit to BZO95/go-ethereum that referenced this pull request May 21, 2026
It's a pull request based on the ethereum#32523 , implementing the structure of
trienode history.